state-of-the-art natural language processing
"What's my model inside of?": Exploring the role of environments for grounded natural language understanding
In contrast to classical cognitive science, which studied brains in isolation, ecological approaches focused on the role of the body and environment in shaping cognition. Similarly, in this thesis we adopt an ecological approach to grounded natural language understanding (NLU) research. Grounded language understanding studies language understanding systems situated in the context of events, actions and percepts in naturalistic or simulated virtual environments. Where classic research tends to focus on designing new models and optimization methods while treating environments as given, we explore the potential of environment design for improving data collection and model development. We developed novel training and annotation approaches for procedural text understanding based on text-based game environments. We also drew upon the embodied cognitive linguistics literature to propose a roadmap for grounded NLP research, and to inform the development of a new benchmark for measuring the progress of large language models on challenging commonsense reasoning tasks. We leveraged the richer supervision provided by text-based game environments to develop Breakpoint Transformers, a novel approach to modeling intermediate semantic information in long narrative or procedural texts. Finally, we integrated theories on the role of environments in collective human intelligence to propose a design for AI-augmented "social thinking environments" for knowledge workers such as scientists.
GitHub - flairNLP/flair: A very simple framework for state-of-the-art Natural Language Processing (NLP)
Developed by Humboldt University of Berlin and friends. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS), special support for biomedical data, sense disambiguation and classification, with support for a rapidly growing number of languages. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings, BERT embeddings and ELMo embeddings. Our framework builds directly on PyTorch, making it easy to train your own models and experiment with new approaches using Flair embeddings and classes. New: Most Flair sequence tagging models (named entity recognition, part-of-speech tagging etc.) are now hosted on the HuggingFace model hub!
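The idea of combining different embeddings can be illustrated with a small sketch. This is plain Python, not Flair code: the two toy embedders, the `stack` helper and `embed_sentence` are hypothetical names invented for illustration, and the only assumption carried over is that combining embeddings means concatenating the vectors each one produces for a token.

```python
from typing import Callable, Dict, List

Vector = List[float]
Embedder = Callable[[str], Vector]  # hypothetical: maps one token to a vector


def length_embedder(token: str) -> Vector:
    """Toy 2-d embedding: token length and a capitalization flag."""
    return [float(len(token)), 1.0 if token[:1].isupper() else 0.0]


def vowel_embedder(token: str) -> Vector:
    """Toy 1-d embedding: fraction of characters that are vowels."""
    vowels = sum(ch in "aeiou" for ch in token.lower())
    return [vowels / len(token) if token else 0.0]


def stack(embedders: List[Embedder]) -> Embedder:
    """Combine embedders by concatenating their per-token vectors."""
    def stacked(token: str) -> Vector:
        out: Vector = []
        for embed in embedders:
            out.extend(embed(token))
        return out
    return stacked


def embed_sentence(text: str, embed: Embedder) -> Dict[str, Vector]:
    """Embed each whitespace-separated token of a sentence."""
    return {tok: embed(tok) for tok in text.split()}


# The combined embedder yields 2 + 1 = 3 dimensions per token.
combined = stack([length_embedder, vowel_embedder])
vectors = embed_sentence("Berlin is big", combined)
```

Because `stack` returns something with the same call signature as a single embedder, downstream code does not need to know how many embeddings were combined; in a real library the embedders would be pretrained models rather than hand-written functions.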
Hugging Face: State-of-the-Art Natural Language Processing in ten lines of TensorFlow 2.0
Training with a strategy gives you better control over what happens during training. By switching between strategies, the user can select the distributed fashion in which the model is trained, from multiple GPUs to TPUs. As of the time of writing, TPUStrategy is the only surefire way to train a model on a TPU using TensorFlow 2. Building a custom loop using a strategy makes even more sense in that regard, as strategies can easily be swapped and training on multiple GPUs would require practically no code change. Building a custom loop requires a bit of work to set up, so the reader is advised to open the following Colab notebook to get a better grasp of the subject at hand. It does not go into the details of tokenization as the first Colab did, but it shows how to build an input pipeline that will be used by the TPUStrategy.
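The claim that swapping strategies needs practically no code change can be sketched without TensorFlow. The following is a toy in plain Python, not the TensorFlow API: `SingleDeviceStrategy`, `MultiReplicaStrategy` and `train` are hypothetical names, the "replicas" are simulated by slicing a list, and the model is a single weight fitted by gradient descent. It only illustrates the design, in which the custom loop talks to a strategy object and never to the devices directly.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, float]  # (x, y) pair


class SingleDeviceStrategy:
    """Hypothetical strategy: runs the step once on the whole batch."""

    def run(self, step_fn: Callable[[Sequence[Example]], float],
            batch: Sequence[Example]) -> List[float]:
        return [step_fn(batch)]


class MultiReplicaStrategy:
    """Hypothetical strategy: splits the batch across simulated replicas."""

    def __init__(self, num_replicas: int) -> None:
        self.num_replicas = num_replicas

    def run(self, step_fn: Callable[[Sequence[Example]], float],
            batch: Sequence[Example]) -> List[float]:
        n = self.num_replicas
        shards = [batch[i::n] for i in range(n)]
        # Each shard stands in for the work done on one device.
        return [step_fn(shard) for shard in shards if shard]


def train(strategy, data: Sequence[Example],
          epochs: int = 50, lr: float = 0.01) -> float:
    """Custom loop fitting y = w * x; identical for every strategy."""
    w = 0.0
    for _ in range(epochs):
        def step_fn(shard: Sequence[Example]) -> float:
            # Summed gradient of (w * x - y)^2 with respect to w.
            return sum(2.0 * (w * x - y) * x for x, y in shard)

        per_replica = strategy.run(step_fn, data)
        grad = sum(per_replica) / len(data)  # reduce across replicas
        w -= lr * grad
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_single = train(SingleDeviceStrategy(), data)
w_multi = train(MultiReplicaStrategy(num_replicas=2), data)
```

Only the object passed to `train` changes between the two runs, and both converge to a weight close to 2. In TensorFlow the same role is played by the strategy classes under `tf.distribute`, with the notebook referenced above as the place to see the real input pipeline.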
AI Analysis Gives Guidance to Crisis Counselors
A study by Cornell University researchers and the Crisis Text Line crisis-counselor platform described how volunteer crisis counselors' use of language evolves. The team used state-of-the-art natural language processing to learn that the language employed by counselors systematically changes, based on their training and empathy for callers in distress, giving rise to unique voices for calming those distressed individuals. The researchers analyzed more than 1 million anonymized texts from about 3,500 counselors on the Crisis Text Line. Crisis Text Line's Robert Filbin said the study's insights will help the platform train and guide crisis counselors. Cornell's Cristian Danescu-Niculescu-Mizil said, "This is an example of how natural language processing techniques can assist the development of skills in conversation-heavy professions."